Written
Corpus
Language Type:
Monolingual
Languages:
Chinese
Availability:
From Owner
License:
Size:
5.8 MByte
Production Status:
Existing-used
Use:
Text Mining
- Paper title: Stochastic Tokenization with a Language Model for Neural Text Classification
- Paper track: Long/Sentiment Analysis and Argument Mining
- Paper status: Accept
| Author Number | Name | Resource Name | Country |
|---|---|---|---|
| Main Contact | Tatsuya Hiraoka | ChnSentCorp | /N |
Documentation:
None
Written
Corpus
Language Type:
Multilingual
Languages:
Chinese English Japanese
Availability:
From Data Center(s)
License:
Size:
None
Production Status:
Existing-used
Use:
Text Mining
- Paper title: Stochastic Tokenization with a Language Model for Neural Text Classification
- Paper track: Long/Sentiment Analysis and Argument Mining
- Paper status: Accept
| Author Number | Name | Resource Name | Country |
|---|---|---|---|
| Main Contact | Tatsuya Hiraoka | NTCIR-6 Opinion | /N |
Documentation:
None
Written
Corpus
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
ELRA
Size:
294 KByte
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
- Paper title: A Neural Multi-digraph Model for Chinese NER with Gazetteers
- Paper track: Short/Information Extraction and Text Mining
- Paper status: Accept
| Author Number | Name | Resource Name | Country |
|---|---|---|---|
| Main Contact | Ruixue Ding | E-commerce-NER | /N |
Documentation:
Documentation in English. It can be found at the resource URL
Written
Evaluation Data
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
MIT
Size:
500 words
Production Status:
Newly created-finished
Use:
Evaluation/Validation
- Paper title: Modeling Semantic Compositionality with Sememe Knowledge
- Paper track: Long/Word-level Semantics
- Paper status: Accept
| Author Number | Name | Resource Name | Country |
|---|---|---|---|
| Main Contact | Junjie Huang | Semantic Compositionality Degree of 500 Chinese MWE with Two Constituents | /N |
Documentation:
None
Written
Evaluation Data
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
OpenSource
Size:
496 MByte
Production Status:
Newly created-finished
Use:
Document Pair Relationship Classification
- Paper title: Matching Article Pairs with Graphical Decomposition and Convolutions
- Paper track: Long/Document Analysis
- Paper status: Accept
| Author Number | Name | Resource Name | Country |
|---|---|---|---|
| Main Contact | Bang Liu | Chinese News Same Event/Story Dataset | /N |
Documentation:
The dataset description is in our paper "Matching Article Pairs with Graphical Decomposition and Convolutions".
Written
Corpus
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
CreativeCommons
Size:
25 MByte
Production Status:
Newly created-finished
Use:
Question Answering
- Paper title: A Sentence Cloze Dataset for Chinese Machine Reading Comprehension
- Paper track: Short paper/
- Paper status: Accept Oral
| Author Number | Name | Resource Name | Country |
|---|---|---|---|
| Main Contact | Yiming Cui | CMRC 2019 | /N |
Documentation:
None
Written
Tagger/Parser
Language Type:
Multilingual
Languages:
Catalan Chinese Czech English German Spanish
Availability:
Freely Available
License:
CC BY-SA-NC 4.0
Size:
None
Production Status:
Newly created-finished
Use:
Semantic Role Labeling
- Paper title: Bridging the Gap in Multilingual Semantic Role Labeling: a Language-Agnostic Approach
- Paper track: Long paper/
- Paper status: Accept Oral
| Author Number | Name | Resource Name | Country |
|---|---|---|---|
| Main Contact | Simone Conia | Multi-SRL (COLING 2020) | /N |
Documentation:
None
Written
Corpus
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
Gnu
Size:
2.03 MByte
Production Status:
Newly created-finished
Use:
Discourse
- Paper title: A Document-Level Neural Machine Translation Model with Dynamic Caching Guided by Theme-Rheme Information
- Paper track: Long paper/
- Paper status: Accept Poster
| Author Number | Name | Resource Name | Country |
|---|---|---|---|
| Main Contact | Yidong Chen | Chinese Theme-Rheme Discourse Dataset | /N |
Documentation:
None
Written
Evaluation Data
Language Type:
Bilingual
Languages:
Chinese Mandarin Chinese
Availability:
Freely Available
License:
CreativeCommons
Size:
2969 sentences
Production Status:
Newly created-finished
Use:
Word Sense Disambiguation
- Paper title: Try to Substitute: An Unsupervised Chinese Word Sense Disambiguation Method Based on HowNet
- Paper track: Short paper/
- Paper status: Accept Oral
| Author Number | Name | Resource Name | Country |
|---|---|---|---|
| Main Contact | Fanchao Qi | New HowNet-based Word Sense Disambiguation Dataset | /N |
Documentation:
Chinese documentation exists but is not yet publicly available.
Written
Corpus
Language Type:
Bilingual
Languages:
Chinese English
Availability:
Freely Available
License:
Creative Commons Attribution-NonCommercial-NoDerivs 3.0
Size:
213377 sentences
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Dynamic Curriculum Learning for Low-Resource Neural Machine Translation
- Paper track: Long paper/
- Paper status: Accept Oral
| Author Number | Name | Resource Name | Country |
|---|---|---|---|
| Main Contact | Chen Xu | IWSLT 2015 English-Chinese | /N |
Documentation:
IWSLT 2015 evaluation campaign: training/development data




